Improved Phoneme-History-Dependent Search Method for Large-Vocabulary Continuous-Speech Recognition
Abstract
This paper presents an improved phoneme-history-dependent (PHD) search algorithm. The method is optimal under the assumption that the starting time of a recognized word depends on only a few preceding phonemes (the phoneme history). The computational cost and the number of recognition errors can be reduced if the phoneme-history-dependent search uses re-selection of the preceding word and an appropriate length of phoneme histories. These improvements increase the speed of decoding and help to ensure that the resulting word graph contains the correct word sequence. In a 65k-word domain-independent Japanese read-speech dictation task and a 1000-word spontaneous-speech airline-ticket-reservation task, the improved PHD search was 1.2–1.8 times faster than a traditional word-dependent search under the condition of equal word accuracy. The improved search reduced the number of errors by up to 21% under the condition of equal processing time. The results also show that our search can generate more compact and accurate word graphs than the original PHD search method. In addition, we investigated the optimum length of the phoneme history in the search.
Key words: speech recognition, search algorithm, multi-pass search, word graph, phoneme-history-dependent search
Similar resources
Speech Input Acoustic Analysis Phoneme Inventory Pronunciation Lexicon Language Model
This paper gives an overview of an architecture and search organization for large-vocabulary continuous speech recognition (LVCSR) at RWTH. In the first part of the paper, we describe the principle and architecture of an LVCSR system. In particular, the issues of modeling and search for phoneme-based recognition are discussed. In the second part, we review the word-conditioned lexical tree search...
Look-ahead Techniques for Improved Beam Search
This paper presents two look-ahead techniques for large vocabulary continuous speech recognition. These two techniques, which are referred to as language model look-ahead and phoneme look-ahead, are incorporated into the pruning process of the time-synchronous one-pass beam search algorithm. The search algorithm is based on a tree-organized pronunciation lexicon in connection with a bigram lang...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
An efficient search method for large-vocabulary continuous-speech recognition
This paper proposes an efficient method for large-vocabulary continuous-speech recognition, using a compact data structure and an efficient search algorithm. We introduce a very compact data structure, the DAWG, as a lexicon to reduce the search space. We also propose a search algorithm to obtain the N-best hypotheses using the DAWG structure. This search algorithm is composed of two phases: “forward ...